Database Management System
Drawbacks of using file systems to store data
1. Data redundancy and inconsistency
- Multiple file formats, duplication of information in different files
2. Difficulty in accessing data
- Need to write a new program to carry out each new task
3. Data isolation - multiple files and formats
4. Integrity problems
- Integrity constraints should be implemented in program code rather than stated explicitly
- Hard to add new constraints or change existing ones
5. Atomicty of updates
- Failures may leave the database in an inconsistent state
6. Concurrent access by multiple users
- Concurrent access needed for performance
- Uncontrolled concurrent access can lead to inconsistencies.
7. Security problems
- Hard to do access control
Levels of abstraction
- Physical Level: Described how a record is stored
- Logical Level: Describe the data and relations
- View Level: Hiding details of data types
Instances and Schemas
- Schema: Logical structure/plan of the database
- Physical Schema
- Logical Schema
- Instance: Content of the database in a particular point in time
Physical Data independence
Ability to modify physical schema without changing logical schema
Data Models
- Relational Model
- Entity-Relationship Data model
- Object-Based data model
- Semistructured data model (XML)
Relational Model
Data Manipulation Language (DML)
- Languages for accessing and manipulating data
- DML is also known as query language
- eg: SQL
- Two classes
- Procedural: Tell what data and how to get data
- Declarative: Tell what is needed without telling how
Data Definition Language (DDL)
- Used to define database schema
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
- DDL compiler generates a set of table templates stored in a data dictionary
- Data dictionary containes metadata
- Database Schema
- Integrity constraints
- Primary key
- References
- Authorization
Database Design
- Logical Design: Designing schema
- Physical Design: Design Physical layout of database
Design Approaches
- Normalization Theory
- Entity Relationship model
- Modeled as a collection of entities and relationships
Entity Relationship model
Object-Relational Data model
- Relational model uses flat and atomic values
- eg: numbers, strings
- Object-relational model uses object orientation and deal with added data types.
- Allow non atomic types including nested relations
XML : Extensible Markup Language
- Ability to create nested tag structures
Database Engine
- Storage Manager
- Query Processing
- Transaction Manager
Storage Management
- Manage efficient storage and retrieval of data
- Provide an interface between low level data stored in the database and the application programs.
Query Processing
- Parsing and translation
- Optimization
- Evaluation
Alternative ways of evaluating a given query.
- Equivalent expressions
- Different algorithms for each operation
Transaction Management
- Transaction : A collection of operations that performs a single logical function
- Transaction management make sure that the database stays in a consistent state despite system / transaction failures.
- Concurrency control controls the interaction among concurrent transactions
Database Users and Administrators
- Naive Users
- Application Programmers
- Sophisticated users (analysts)
- Database Administrators
Database Architecture
Architecture of a database depends on the underlying system
- Centralized
- Client-Server
- Parallel (multi-processor)
- Distributed